Use of Linked Data principles for semantic management of scanned documents Emprego dos princípios Linked Data para gestão semântica de documentos digitalizados

نویسندگان

  • Luciane Lena Pessanha MONTEIRO
  • Mark Douglas de Azevedo JACYNTHO
چکیده

The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of Web application for semantic management of scanned documents. The main goal is to record scanned documents describing them in a way the machine is able to understand and process them, filtering content and assisting us in searching for such documents when a decision-making process is in course. To this end, machine-understandable metadata, created through the use of reference Linked Data ontologies, are associated to documents, creating a knowledge base. To further enrich the process, (semi)automatic mashup of these metadata with data from the new Web of Linked Data is carried out, considerably increasing the scope of the knowledge base and enabling to extract new data related to the content of stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Interoperabilidade e Portabilidade de Documentos Digitais Usando Oontologias

Our purpose is to enable interoperability of documents and achieve portability of digital documents through the reuse of content and format in different plausible combinations. We propose the characterization of digital documents using ontologies as a solution to the problem of lack of interoperability in the implementations of document formats. As proof of concept we consider the portability b...

متن کامل

Joint semantic discourse models for automatic multi-document summarization

Automatic multi-document summarization aims at selecting the essential content of related documents and presenting it in a summary. In this paper, we propose some methods for automatic summarization based on Rhetorical Structure Theory and Cross-document Structure Theory. They are chosen in order to properly address the relevance of information, multidocument phenomena and subtopical distributi...

متن کامل

Normalização Textual e Indexação Semântica Aplicadas na Filtragem de SMS Spam

Resumo—Nos últimos anos, a popularização dos celulares e smartphones impulsionou o uso de SMS como forma alternativa e barata de comunicação. O crescimento de adeptos ao serviço aliado a alta confiança que os usuários possuem nesses tipos de mensagens, vêm atraindo a atenção de pessoas e empresas mal intencionadas, conhecidas como spammers. O spam nesse contexto representa um problema para os m...

متن کامل

Un Sistema de Extracción de Información Basado en Ontologías para Documentos en el Dominio de las Tecnologías de Información An Ontology-Based Information Extractor for Data-Rich Documents in the Information Technology Domain

This paper presents an information extraction method, suitable for data-rich documents, based on the knowledge represented in a domain ontology. The extractor combines a fuzzy string matcher and a word sense disambiguation (WSD) algorithm. The fuzzy string matcher finds mentions of terms combining character-level and token-level similarity measures dealing with non-standardized acronyms and inc...

متن کامل

Processamento de consultas na Web de Dados: uma abordagem para busca de fontes de dados relevantes

The adoption of Linked Data principles has contributed towards the creation of a Web of Data, allowing the development of applications and tools which run queries over available information. One of the main challenges for the query processing over the Web is the selection of relevant sources, i.e., sources which could contribute significantly to the result of a query. In this paper, we discuss ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016